EN FR
EN FR


Section: New Results

Simulation

Simulation of vector architecture

Participants : Vania Joloboff, Yang Yu.

Many architectures including PowerPC and ARM now have vectorized instructions, that is, instructions that can execute on several data items in parallel (e.g 8 simultaneous additions) on specific vector data.

We have implemented the ALTIVEC extension of the PowerPC to support the vector instructions.

Native translation using LLVM

Participants : Vania Joloboff, Xinlei Zhou, Zuyu Zhang.

We have started to implement a new technique of dynamic translation. This new method consists in decompiling the binary object code into an abstract representation and recompiling it to native host code.

The decompilation of the program amounts to reconstructing the simulated program Control Flow Graph using an intermediate representation. We have chosen LLVM (Low Level Virtual Machine), defined by University of Illinois, and now widely adopted in many projects, as our representation format. Using LLVM allows us to directly use the LLVM Intel code generator.

The SimSoC binary decoder has been modified to identify basic blocks (blocks of sequential instructions ending with a branch instruction). After instructions have been grouped into basic blocks, they are translated into an LLVM representation and finally the LLVM compiler is called to generate native code.

A first version of this technique has been implemented for both the ARM and Power Architecture. We have reach a considerable speed improvement in the generated code, with the execution speed multiplied by factor of 2 to 8. However the translation time from binary to LLVM and from LLVM to native code is significant (translation speed is roughly 1000 instructions per second). Consequently the overall speed is improved by only a factor of 20 to 50 percent when the simulation are relatively short test programs [20] .

In order to reach still higher simulation speed we need to use a more sophisticated analysis of the control flow graph. The idea is to do an edge profiling analysis of the basic blocks in order to identify larger blocks. This work is under development.

Another idea is to use multi-processor hosts machine to parallelize translation from LLVM to native code. This is also under investigation.

Trace Analysis

Participants : Guillaume Merle, Vania Joloboff.

Simulation sessions produce huge trace files, sometimes now in hundreds of gigabytes, that are hard to analyze with a quick response time. This comes down to two sub-problems:

  • The trace file size. Trace files are huge because they include lots of information. But when looking for a specific problem, one does not need all of this information. To search one given defect, one may ignore a large amount of the data in the trace file. One would like the trace file to contain only relevant information to the concerned problem.

  • The expressive power of the language to analyze the trace, and its usability. If the language is limited to expression search, it is easy to use but hard to construct sophisticated formulas. If the language used is Linear Temporal Logic (LTL), there is a lot of expressive power but many engineers are unable to write a LTL formula and to maintain it over time.

We would like to build a trace analysis tool that includes a language which allows expression of time-related formulas but is simple to formulate expressions. When this language is compiled, ideally the compiler is smart enough to identify independent formulae, the search of which can be parallelized, and it is also smart enough to generate "filter scripts".

When compiling one trace language input file, it would generate, from one input file, N filter scripts and N analyzers. Then during the simulation, the huge raw trace file is actually split into N smaller trace files, each relevant to one problem only, filtering out all unnecessary data. Hence trace files sizes would be considerably reduced.

We have started to design a trace language and a compiler, and extended the SimSoC simulator to support generation of trace files with a filter.

A first version of the trace language compiler has been coded in OCAML.

In the current version under development, the filters are not generated but coded manually, and filters are not parallelized.

Generation of simulators from vendor specification

Participants : Frédéric Blanqui, Vania Joloboff, Jean-François Monin, Xiaomu Shi, Frédéric Tuong.

Starting last year, we undertook the task of generating automatically an instruction set simulator (ISS) from the vendor specification in a PDF file. In order to generate the C code of the simulator, it is assumed such vendor specification contains at least some formal definitions of the instruction set that can be analyzed. It is the case to a wide extent for the ARM, the PowerPC and the SH architectures.

The process of generating the simulator consists of 4 major steps, first eliminating from the PDF file irrelevant information, next construct from the relevant data an abstract syntax representation of the instruction set, then to generate the C code of the simulator, using some additional data provided manually to complete the vendor specification.

This work was completed last year for the ARM architecture with the documentation form ARM corporation [35] . This year, we did similar work for the SH architecture from specification from RENESAS corporation.

We have indeed generated a simulator for the SH4 architecture [31] , which has not been fully tested yet.

However, this works has proved that the abstract syntax we have defined is powerful enough to describe two different architectures with significant differences in the way they are described by the vendor.

First steps towards the certification of an ARM simulator

Participants : Frédéric Blanqui, Jean-François Monin, Xiaomu Shi, Frédéric Tuong.

The simulation of Systems-on-Chip (SoC) is nowadays a hot topic because, beyond providing many debugging facilities, it allows the development of dedicated software before the hardware is available. Low-consumption CPUs such as ARM play a central role in SoC. However, the effectiveness of simulation depends on the faithfulness of the simulator. To this effect, in [24] , we propose here to prove significant parts of such a simulator, SimSoC. Basically, on one hand, we develop a Coq formal model of the ARM architecture while on the other hand, we consider a version of the simulator including components written in Compcert-C [61] . Then we prove that the simulation of ARM operations, according to Compcert-C formal semantics, conforms to the expected formal model of ARM. Size issues are partly dealt with using automatic generation of significant parts of the Coq model and of SimSoC from the official textual definition of ARM. However, this is still a long-term project. We report here the current stage of our efforts and discuss in particular the use of Compcert-C in this framework.